Academic Open Internet Journal

www.acadjournal.com

Volume 14, 2005

 

 

A REVIEW OF CONSISTENCY MECHANISMS FOR QoS AWARE CONTENT DISTRIBUTION NETWORKS

P.Venkatesh1, S.N.Sivanandam2, R.Venkatesan3

  1. Lecturer, 2. Professor and Head, 3. Professor

Department of Computer Science and Engineering,

PSG College of Technology, Coimbatore, India

Email: venkateshp@cse.psgtech.ac.in, venkip_ms@yahoo.co.in

Abstract:

            The World Wide Web has witnessed enormous growth for more than a decade and is used to serve users vast amounts of information stored at geographically distributed sites. To alleviate the burden of increasing load on the network infrastructure and on web servers, caching has been considered a viable option. The key problem with caching is providing cache consistency. A Content Distribution Network (CDN) is a shared network of servers or caches that delivers content to users on behalf of content providers. In this paper, we examine various strategies that have been used to provide cache consistency in CDNs for improved performance.

Key Words: Web Caching, Content Distribution Networks, Proxy Server, Cache Consistency

1. Introduction:

            Large-scale distributed systems often address performance and quality-of-service issues by way of caching and replication. Caching aims to move content closer to users to help diminish the load on origin servers, eliminate redundant data traversal on the network, and reduce user-perceived latency. Traditional caching has limited effectiveness due to the diversity of resource access, the increase in dynamic content, and concerns about the consistency of cached responses. In today's scenario, each site needs additional mechanisms to deliver acceptable performance amidst flash crowds and Distributed Denial of Service attacks.

            A Content Distribution Network (CDN) is a mechanism to deliver content to end users on behalf of origin web sites. It offloads work from origin servers by serving some or all of the content of web pages through several replica servers that are part of the CDN. It attempts to overcome the drawbacks encountered in traditional caching and provide users with good performance. CDNs such as Akamai and Digital Island achieve scalability through replication [1]. Documents requested by clients can generally be categorized as static documents, dynamic documents, and dynamically generated documents. It is the goal of the CDN designer to devise an efficient scheme for updating documents of any category and for maintaining the consistency of documents at the replicas with the origin server. A caching infrastructure offers better delivery to the client, lower latency, higher availability, and lower load on the network links. The exact cache consistency mechanism and degree of consistency employed by an intermediary depend on the nature of the cached data, because not all types of data need the same level of consistency guarantees.

            The remainder of this paper is organized as follows. Sections 2 and 3 provide an overview of caching and consistency, respectively. Section 4 describes schemes for managing the consistency of documents. Section 5 provides a discussion and concludes the paper.

2. Overview of Caching:

            Caching has proven to be an effective and practical solution for improving the scalability and performance of web servers. Proxy caching is a commonly used technique to reduce content access latencies [2, 3]. A proxy server (or web cache) is normally placed between clients and servers. Client requests are directed towards the proxy; if the proxy satisfies a request with locally cached data, it is called a cache hit. If the proxy must contact the origin server to satisfy the request, it is called a cache miss. Proxy caching normally reduces network bandwidth usage because proxies are commonly deployed at the edges of networks. Due to the enormous growth of the World Wide Web, however, proxy caching alone does not provide a scalable solution. This led to the development of Content Distribution Networks (CDNs), which provide scalable and efficient dissemination of data to many geographically spread clients with reduced access latency.
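
            As a rough illustration of the hit/miss behaviour just described, the following Python sketch (the names and the simulated origin are our own, not taken from any cited system) shows how a proxy answers a request from its local store and falls back to the origin server on a miss.

    class ProxyCache:
        """Minimal, illustrative proxy cache: hits are served locally, misses from the origin."""

        def __init__(self, fetch_from_origin):
            self.store = {}                          # url -> cached response body
            self.fetch_from_origin = fetch_from_origin

        def get(self, url):
            if url in self.store:                    # cache hit: satisfied with local data
                return self.store[url], "HIT"
            body = self.fetch_from_origin(url)       # cache miss: contact the origin server
            self.store[url] = body
            return body, "MISS"

    # Simulated origin for the example.
    cache = ProxyCache(lambda url: "<content of %s>" % url)
    print(cache.get("http://example.com/index.html"))   # ... 'MISS'
    print(cache.get("http://example.com/index.html"))   # ... 'HIT'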

            A CDN comprises origin servers at the root level, replica origin servers at the next level, and intermediate proxy servers (caches) at the lower level. Normally, content is delivered to the client from replica servers via intermediate proxy servers (see Figure 1). To facilitate caching in a CDN, the cache needs to implement three fundamental functions.

                                Figure 1: Basic CDN Architecture

            Content can be delivered to the cache for subsequent use through several methods.

            To allow caches to query other caches within the same CDN for content, the Internet Cache Protocol (ICP) [4] is used. When a request for a certain object arrives at a cache and the content is not available locally, the cache can use ICP to find that object at other caches.

            Caching can be performed for different data types, as listed by D. C. Verma [5]: static content, dynamic content, processed data, multimedia streams, database entries, LDAP (Lightweight Directory Access Protocol) directories, and application code. Static data is easier to cache at a replica server because it changes infrequently, and periodic synchronization yields good results. Caching dynamic data requires a more elaborate mechanism because the data changes frequently; it helps in situations in which data is likely to be read many times before it becomes out of date. Caching processed data is more advantageous than caching raw data if the same processing request is expected again shortly; it reduces the overhead of repeatedly performing the same processing at the cache for requests that occur frequently.

            Multimedia content consists of audio-visual data that is delivered to the client in streaming mode. It allows the client to play a segment of the file that has been downloaded for a fixed time period while the remaining segments are downloaded from the network. In database caching, the key challenge is to maintain highly reliable and consistent database transactional semantics without affecting performance. An LDAP directory is an organized data type that differs from a database in two ways: a) all its entries are organized hierarchically, with a distinguished name for each entry, and b) it allows only a very simple set of query and lookup operations. Application code changes relatively slowly, so it can be cached at replica servers and executed efficiently.

            A common problem encountered in a caching environment is ensuring that the cache always holds the most recent copy of the requested content (freshness). This goal is achieved by labeling the content with parameters that inform the cache when it needs to refresh the content from the original content source. The cache can also be programmed to check the content on a regular basis.
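
            A minimal sketch of such freshness labelling is given below (Python; the field names and the 300-second lifetime are assumptions for illustration): the content source attaches an expiry time, and the cache refreshes the object once that time has passed.

    import time

    def is_fresh(entry, now=None):
        """An entry is fresh while the current time is before its labelled expiry."""
        now = time.time() if now is None else now
        return now < entry["expires_at"]

    # The content source labels the object as valid for 300 seconds (assumed value).
    entry = {"body": "<html>...</html>", "expires_at": time.time() + 300}
    print(is_fresh(entry))                           # True: served from cache
    print(is_fresh(entry, now=time.time() + 600))    # False: refresh from the origin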

            In the case of distributed caches, R. Tewari et al. [6] suggest four design principles: a) minimize the number of hops to locate and access data, b) do not slow down misses, c) share data among many caches, and d) cache data close to clients. To achieve these design principles, various strategies are considered: 1) separating data paths from metadata paths and creating a metadata hierarchy to track where data is stored, 2) maintaining location hints so that caches can locate nearby copies with reduced network latency, 3) avoiding store-and-forward delays by using direct cache-to-cache data transfers, and 4) using push caching to improve hit and miss response times.

            A scalable architecture for cooperative web caching has been proposed by B. Ciciani et al. in [7]. It uses a multi-tier architecture in which cache servers are distinguished as near and far servers. The first level of cooperation, called intra-cluster, occurs among servers within a single cluster and uses the Cache Digest protocol. The second level, called inter-cluster, occurs among the master caches of multiple clusters and uses the Cache Master Protocol (CMP).

3. Overview of Consistency:

            Consistency for CDN caches is implemented by selecting an appropriate consistency model, which in turn relies on consistency policies and content distribution mechanisms (G. Pierre et al. [8]). A consistency model is basically a contract between a content delivery system and its clients that dictates the consistency-related properties of the content. A consistency policy defines how, when, and to which object replicas the content distribution mechanisms are applied. Replica servers exchange replica updates using content distribution mechanisms.

            Consistency models vary in the strictness with which they enforce consistency. In strong consistency, the system guarantees that all replicas are identical from the perspective of clients. Delta consistency returns data that is never outdated by more than δ time units. Weak consistency, in turn, allows replicas to differ, but ensures that all updates reach all replicas within some bounded time. Mutual consistency ensures that a group of objects are mutually consistent with each other. Consistency models define consistency along three different axes: time, value, and order. Time-based consistency ensures that an update applied to a replica at time t is visible to other replicas and clients before time t + ∆. Value-based consistency ensures that the difference between the value of a replica and that of other replicas is no greater than a certain ∆. Order-based consistency is generally exploited in replicated databases.
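
            As a hedged illustration of the time axis, the Python check below (with invented constants) approximates delta consistency by serving a replica only if it was last synchronized with the origin no more than ∆ time units ago.

    DELTA = 60.0   # illustrative bound, in seconds

    def usable_under_delta(last_synchronized, now):
        """A replica may be served if it is not outdated by more than DELTA time units."""
        return (now - last_synchronized) <= DELTA

    print(usable_under_delta(last_synchronized=1000.0, now=1030.0))   # True  (30 s old)
    print(usable_under_delta(last_synchronized=1000.0, now=1100.0))   # False (100 s old)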

            Content distribution mechanisms are classified along two different aspects: the form of the updates and the direction in which updates are triggered. G. Pierre et al. [8] note that replica updates can be carried out in three different forms.

            The update transfer can be initiated either by a replica server that needs a new version (the pull method) or by the replica server that holds a new version (the push method). The pull-based approach uses either a Time-to-Refresh (TTR) attribute or the If-Modified-Since field of HTTP. The push-based scheme ensures that communication occurs only when there is an update; it can meet strong consistency requirements without introducing communication overhead. An important constraint of the push-based scheme is that the object's home server needs to keep track of all replica servers that must be informed.
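
            The Python sketch below illustrates the pull method under an assumed TTR value: the replica serves its copy until the TTR expires, then issues an If-Modified-Since style check against a simulated origin. The class and field names are our own.

    import time

    class Origin:
        """Simulated origin server answering If-Modified-Since style checks."""
        def __init__(self):
            self.body, self.modified_at = "v1", time.time()

        def if_modified_since(self, since):
            if self.modified_at > since:
                return self.body, True       # changed: send the new version
            return None, False               # unchanged: "not modified"

    class Replica:
        def __init__(self, origin, ttr=30.0):
            self.origin, self.ttr = origin, ttr
            self.body, _ = origin.if_modified_since(0.0)
            self.validated_at = time.time()

        def read(self):
            if time.time() - self.validated_at > self.ttr:    # TTR expired: pull again
                body, modified = self.origin.if_modified_since(self.validated_at)
                if modified:
                    self.body = body
                self.validated_at = time.time()
            return self.body

    print(Replica(Origin()).read())   # "v1", served without a second origin contact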

            To maintain the consistency of data distributed across multiple sites, several schemes can be considered; the choice of scheme depends upon the nature of the data that is replicated. D. C. Verma [5] suggests a number of such schemes.

4. Consistency Management Schemes:

            A. Iyengar et al. [9] broadly categorize consistency provisioning into server-driven, client-driven, and explicit mechanisms. In the client-driven approach, intermediaries poll the server on every read to determine whether the data has changed. This imposes a large message overhead and also increases the response time; the advantage is that the server need not maintain any state about the proxies holding the data. The server-driven approach has to maintain such state but reduces the control-message overhead.

4.1 Invalidation / Propagation approach:

            In server-driven invalidation, an invalidation message is sent to all replicas when a document is changed at the origin server. Each replica then needs to fetch the updated version individually at a later time, which leads to inefficient usage of the distribution network for content delivery and inefficiency in managing consistency at the replicas.

            H. Yu et al. [10] propose an architecture that uses a caching hierarchy and application-level multicast routing to convey invalidations. Caches form a multicast group and send heartbeat messages to each other, and each cache maintains a server table to locate where in the hierarchy the servers are attached. M. Dahlin et al. [11] propose the Web Cache Invalidation Protocol (WCIP) for propagating server invalidations using application-level multicast while providing delta consistency.

            In the propagation approach, the updated version of a document is delivered to all replicas whenever the document changes at the origin server. This may generate significant amounts of unnecessary traffic if documents are updated more frequently than they are accessed. Z. Fei [12] suggests a hybrid approach that generates less traffic than either the propagation or the invalidation approach. The origin server decides, for each document, whether to use propagation or invalidation, based on statistics about the update frequency at the origin server and the request rates collected by the replicas. The decision is based on the relative values of Tp (propagation traffic) and Ti (invalidation traffic): propagation is used if Tp < Ti, and invalidation is used if Tp > Ti. The study explores the effects of the request rate, the average inter-update time, and the number of replicas.
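
            The decision rule can be pictured as follows (Python); the traffic estimates are a simplified stand-in for the model in [12], and all parameter values are invented for illustration.

    def choose_policy(update_rate, request_rate, object_size, inval_size, replicas):
        """Return the cheaper scheme: propagation if Tp < Ti, invalidation otherwise."""
        tp = update_rate * object_size * replicas            # every update pushes the object
        # Invalidation: a small message per update, plus a re-fetch of the object at
        # replicas that are actually asked for it before the next update arrives.
        refetch_fraction = min(1.0, request_rate / update_rate) if update_rate else 1.0
        ti = update_rate * replicas * (inval_size + refetch_fraction * object_size)
        return "propagation" if tp < ti else "invalidation"

    print(choose_policy(update_rate=10, request_rate=100,
                        object_size=20000, inval_size=100, replicas=50))   # propagation
    print(choose_policy(update_rate=100, request_rate=5,
                        object_size=20000, inval_size=100, replicas=50))   # invalidation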

            Cache consistency can also be achieved through the Web Content Distribution Protocol (WCDP) [13]. It is an invalidation and update protocol that supports different consistency levels: strong consistency (for mirror sites), delta consistency (for intermediaries and proxies placed between replica servers and clients), weak consistency, and explicit consistency. It supports atomic invalidations and mutual consistency among objects. Scalability is achieved by grouping objects and messages together, and a hierarchical organization is used for message delivery. Authorization and authentication for different security levels are also supported. WCDP operates between the origin server, mirror sites, and the participating web intermediaries; it is not targeted at inter-CDN operation.

4.2 Leases - A Scalable Approach:

            An approach that provides a smooth tradeoff between state-space overhead and the number of control messages exchanged is leases, introduced by C. Gray and D. Cheriton [14]. Consistency is achieved by associating a lease with each object that is replicated to geographically distributed replica servers. A lease is basically the time period over which a replica server registers its interest in a particular object, and it remains valid until the lease time expires. During the lease time, the object's home server pushes all updates of the object to the replica server. After the lease expires, the replica server can either renew it or ignore it. V. Duvvuri et al. [15] divide leases into three groups: age-based, renewal-frequency-based, and load-based. In age-based leases, the lease time depends on the last time the object was modified. In renewal-frequency-based leases, the object's home server gives longer leases to replica servers that ask for replica validation more often. In load-based leases, the object's home server tends to give shorter lease times when it becomes overloaded. Leases have two drawbacks when considered for CDNs: a) they provide strong consistency for all types of objects, and b) the server needs to maintain state for each proxy caching an object.
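
            The three lease policies can be illustrated with the following Python sketch; the constants and exact formulas are assumptions, not those of [15].

    def age_based_lease(seconds_since_last_modify, factor=0.2, max_lease=600.0):
        """Objects that have stayed unmodified for longer receive longer leases."""
        return min(max_lease, factor * seconds_since_last_modify)

    def renewal_based_lease(renewals_observed, base=30.0, step=15.0, max_lease=600.0):
        """Replica servers that renew more often are granted progressively longer leases."""
        return min(max_lease, base + step * renewals_observed)

    def load_based_lease(server_utilisation, max_lease=600.0, min_lease=10.0):
        """An overloaded home server hands out shorter leases to shed state sooner."""
        return max(min_lease, max_lease * (1.0 - server_utilisation))

    print(age_based_lease(3600))       # stable object  -> 600.0 (capped)
    print(renewal_based_lease(4))      # eager replica  -> 90.0
    print(load_based_lease(0.9))       # loaded server  -> short lease (about 60 s)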

            In order to provide scalability and flexibility for the large number of proxies in a CDN, and to overcome the drawbacks of strategies designed for stand-alone proxies, A. Ninan et al. [16, 17] propose a technique called cooperative consistency, together with the mechanism of cooperative leases. By using ∆-consistency semantics and a single lease for multiple proxies, the mechanism can be applied to CDNs in a scalable manner. Cached objects normally require different levels of consistency guarantees depending on their characteristics and user preferences. ∆-consistency requires that a cached version of an object is never out of date by more than ∆ time units with respect to its server version. Proxies within the CDN are partitioned into non-overlapping groups called regions, and within a region proxies cooperate to maintain the consistency of cached objects. A leader is selected among the proxies of each region, and notifications from the replica server are sent only to the leader. The member proxies then update their object state from the leader using either the invalidation or the propagation approach. Leader selection is done in one of two ways: a) the proxy that first receives a request for an object within the region becomes the leader, or b) a hashing function is employed to determine the leader for the object. For the renewal of leases, two approaches are considered: eager renewals and lazy renewals. When the number of proxies within a region is very large, a multi-level hierarchy is recommended. CDN proxies can also be organized into multiple regions, which provides good scalability using cooperative leases.
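
            A sketch of the hash-based leader selection option (b) is shown below (Python); the proxy names are hypothetical, and the point is only that every proxy in a region computes the same leader for a given object URL.

    import hashlib

    def leader_for(url, region_proxies):
        """All proxies in a region agree on the leader by hashing the object URL."""
        digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
        return region_proxies[int(digest, 16) % len(region_proxies)]

    region = ["proxy-a.cdn.example", "proxy-b.cdn.example", "proxy-c.cdn.example"]
    print(leader_for("http://example.com/logo.png", region))
    print(leader_for("http://example.com/news.html", region))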

4.3 Partial Replication:

            S. Sivasubramanian et al. [18] propose a system that guarantees strong consistency for web applications with high scalability. It uses a mechanism of partial replication (also called on-demand replication) in which data units are replicated only to the servers that access them often. Segmentation of the data into data units and their replication are performed automatically based on the access pattern. Clients are redirected to a close replica using DNS-based redirection. Based on the application's access pattern, the system clusters data units and decides which replication strategy to assign to each cluster. Each application is assigned one origin server, which is responsible for making all application-wide decisions such as clustering data units and placing clusters on servers. The system adopts a master-slave consistency protocol to support a replication-transparent application model, and replica placement and master selection are performed using heuristics. Average read latency (ARL), average write latency (AWL), and the number of consistency messages (NCM) are the metrics considered for measuring overall system performance.
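
            One hedged way to picture the per-cluster strategy selection is a weighted combination of the three metrics named above; the weights, candidate strategies, and numbers in this Python sketch are invented for illustration and are not the actual cost model of [18].

    def cost(arl, awl, ncm, w_read=1.0, w_write=1.0, w_msgs=0.1):
        """Weighted combination of average read latency, write latency and message count."""
        return w_read * arl + w_write * awl + w_msgs * ncm

    candidates = {
        # strategy: (ARL in ms, AWL in ms, consistency messages per period)
        "no-replication":   (120.0, 20.0,   0),
        "full-replication": ( 15.0, 90.0, 500),
        "partial":          ( 25.0, 40.0,  80),
    }
    best = min(candidates, key=lambda name: cost(*candidates[name]))
    print(best)   # -> "partial" with these illustrative numbers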

4.4 Basis Token Consistency:  

            In order to provide strong consistency for web caches, a protocol called Basis Token Consistency (BTC) has been proposed by Adam D. Bradley and Azer Bestavros [19]. It is a backward-compatible and transparently interoperable extension to the HTTP protocol that enables caches to maintain a completely consistent view of the server without requiring out-of-band communication or per-client server state. Consistency here refers to the property that entities served from a single logical cache always carry a state value no older than the one served for the previous request. The protocol has the following properties: (a) strong point-to-point consistency without relying on intermediary cooperation, (b) no per-client state is required at the server or proxy caches, (c) invalidations are naturally aggregated in semantically meaningful ways, (d) invalidation is driven by web applications rather than heuristics, and (e) the necessary data is transmitted only in related responses, so out-of-band and mixed-channel messages are not required. It requires explicit cooperation from server applications and a moderate increase in cache state.

4.5 Hierarchical Cache Consistency:

            An approach that uses a hierarchical structure for server-driven cache consistency has been proposed by J. Yin et al. [20]. It offers the following benefits: reduced client latency, improved network efficiency, and improved server scalability. To cope with the problems of independent node failure and increased latency in a static hierarchy, it provides split and join mechanisms that improve efficiency without breaking the guarantee of strong consistency. It also emphasizes the use of a dynamic hierarchy, in which a node decides whether to join a particular parent based on a) the number of lease requests it receives and b) the fraction of requests it can satisfy locally during a time T.

4.6 Service Differentiation:

            Web caching and content distribution mainly focus on reducing latency (client-centric) and saving bandwidth (network-centric), which is typically achieved by optimizing the cache hit rate as the performance metric. Michal Feldman and John Chuang [21] address service differentiation by focusing on object-centric, or publisher-centric, issues. They introduce a new metric, object lifetime (the amount of time an object resides in the cache before being purged). The scheme uses multi-level LRU (ML-LRU), where each level (e.g., premium, best effort) holds objects that have been classified according to well-known classification policies. Consistency among objects is achieved based on the object lifetime and the differentiated-service strategy that is implemented.
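
            A minimal sketch of a multi-level LRU in this spirit is given below (Python); the class names and capacities are assumptions. Each service class keeps its own LRU list and capacity, so best-effort traffic cannot evict premium objects.

    from collections import OrderedDict

    class MultiLevelLRU:
        """Illustrative ML-LRU: one independent LRU list per service class."""

        def __init__(self, capacities):
            self.levels = {name: OrderedDict() for name in capacities}
            self.capacities = dict(capacities)

        def put(self, level, key, value):
            cache = self.levels[level]
            cache[key] = value
            cache.move_to_end(key)
            if len(cache) > self.capacities[level]:
                cache.popitem(last=False)            # evict the LRU object of this class only

        def get(self, level, key):
            cache = self.levels[level]
            if key not in cache:
                return None
            cache.move_to_end(key)                   # refresh recency on a hit
            return cache[key]

    mlru = MultiLevelLRU({"premium": 2, "best-effort": 1})
    mlru.put("premium", "/a", "A"); mlru.put("premium", "/b", "B")
    mlru.put("best-effort", "/x", "X"); mlru.put("best-effort", "/y", "Y")
    print(list(mlru.levels["premium"]))       # ['/a', '/b']: untouched by best-effort load
    print(list(mlru.levels["best-effort"]))   # ['/y']: '/x' was evicted within its own class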

4.7 Continuous Consistency Model:   

            Normally, in distributed services, the user is forced to select either strong consistency or optimistic (weak) consistency. Haifeng Yu and Amin Vahdat [22] propose a continuous consistency model, which explores the semantic space between strong and optimistic consistency guarantees. In order to capture this consistency spectrum, it defines three metrics: numerical error, order error, and staleness. Using these metrics, arbitrary consistency bounds among replicas can be enforced by a middleware layer. Applications are allowed to specify the maximum probability of inconsistent access in exchange for increased performance.
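
            The three metrics can be pictured as a simple local check (Python sketch; the field names and limits are assumptions): a replica answers a request locally only while all three bounds hold, and otherwise it synchronizes first.

    def within_bounds(replica, limits, now):
        """True while numerical error, order error and staleness are all inside their bounds."""
        return (replica["numerical_error"] <= limits["numerical_error"]
                and replica["order_error"] <= limits["order_error"]
                and (now - replica["last_sync"]) <= limits["staleness"])

    limits  = {"numerical_error": 5, "order_error": 3, "staleness": 10.0}
    replica = {"numerical_error": 2, "order_error": 4, "last_sync": 100.0}
    print(within_bounds(replica, limits, now=105.0))   # False: too many unordered local writes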

5. Discussion and Conclusion

            Due to the increased demand for high-performance data dissemination on the Internet, providing consistency among objects replicated at several sites is of utmost importance. Applications may require different degrees of consistency, and an appropriate mechanism needs to be designed according to the requirement specification. Depending on who is responsible for enforcing consistency, approaches may be classified as server-driven or client-driven. Considerable research has been carried out on cache consistency, and most of the work concentrates on providing consistency for static and dynamic documents.

            In recent years, due to growing user needs and the development of high-performance applications, strong consistency is increasingly desired. Much of the existing work focuses on time-based consistency policies, largely ignoring value-based and order-based consistency. Applications that use technologies such as ASP and CGI to dynamically generate web objects are increasingly used to provide web content.

Replicating a dynamic web object requires replicating both the code and the data that the code acts upon. Consistency for such applications has not received much attention, and considerable research remains to be done to provide a scalable solution. Similarly, multimedia applications contribute a large share of web traffic, and providing scalable consistency mechanisms for them is also an area that needs to be considered.

            In this paper, we have presented an overview of caching and consistency mechanisms that can be employed in Content Distribution Networks, and we have described and analyzed various strategies for implementing cache consistency.

 References

[1] Akamai Technologies, Inc. http://www.akamai.com

[2] Jianliang Xu, Jiangchuan Liu, Bo Li, Xiaohua Jia, "Caching and Prefetching for Web Content Distribution", Computing in Science & Engineering, Volume 6, Issue 4, July-August 2004, pages 54-59.

[3] M. Raunak, P. Shenoy, P. Goyal, K. Ramamritham, P. Kulkarni, "Implications of Proxy Caching for Provisioning Networks and Servers", Proceedings of ACM SIGMETRICS 2000, Volume 28, Issue 1, pages 66-77.

[4] I. Cooper, I. Melve, G. Tomlinson, "Internet Web Replication and Caching Taxonomy", RFC 3040, January 2001.

[5] D. C. Verma, Content Distribution Networks: An Engineering Approach, John Wiley & Sons, 2002.

[6] R. Tewari, M. Dahlin, H. M. Vin, J. S. Kay, "Beyond Hierarchies: Design Considerations for Distributed Caching on the Internet", Technical Report TR98-04, Department of Computer Science, University of Texas at Austin, February 1998.

[7] R. Lancellotti, B. Ciciani, M. Colajanni, "A Scalable Architecture for Cooperative Web Caching", Proceedings of the Workshop on Web Engineering, Networking 2002, Pisa, May 2002.

[8] S. Sivasubramanian, M. Szymaniak, G. Pierre, Maarten van Steen, "Web Replica Hosting System Design", Internal Report IR-CS-001, Department of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands, revised May 2004.

[9] A. Iyengar, E. Nahum, A. Shaikh, R. Tewari, "Enhancing Web Performance", Proceedings of the IFIP 17th World Computer Congress, TC6 Stream on Communication Systems: The State of the Art, pages 95-126, 2002.

[10] H. Yu, L. Breslau, S. Shenker, "A Scalable Web Cache Consistency Architecture", Proceedings of ACM SIGCOMM '99, Boston, MA, September 1999.

[11] D. Li, P. Cao, M. Dahlin, "WCIP: Web Cache Invalidation Protocol", IETF Internet Draft, November 2000.

[12] Z. Fei, "A Novel Approach to Managing Consistency in Content Distribution Networks", Proceedings of the 6th Workshop on Web Caching and Content Distribution, Boston, MA, June 2001.

[13] R. Tewari, T. Niranjan, S. Ramamurthy, "WCDP: Web Content Distribution Protocol", IETF Internet Draft, March 2002.

[14] C. Gray, D. Cheriton, "Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency", Proceedings of the Twelfth ACM Symposium on Operating Systems Principles, pages 202-210, 1989.

[15] V. Duvvuri, P. Shenoy, R. Tewari, "Adaptive Leases: A Strong Consistency Mechanism for the World Wide Web", Proceedings of IEEE INFOCOM 2000, Tel Aviv, Israel, March 2000.

[16] A. Ninan, "Maintaining Cache Consistency in Content Distribution Networks", Master's Thesis, Department of Computer Science, University of Massachusetts, June 2001.

[17] A. Ninan, P. Kulkarni, P. Shenoy, K. Ramamritham, R. Tewari, "Cooperative Leases: Scalable Consistency Maintenance in Content Distribution Networks", Proceedings of WWW2002, Honolulu, Hawaii, May 2002.

[18] S. Sivasubramanian, G. Pierre, Maarten van Steen, "Scalable Strong Consistency for Web Applications", Proceedings of the 11th ACM SIGOPS European Workshop, Leuven, Belgium, September 2004.

[19] Adam D. Bradley, Azer Bestavros, "Basis Token Consistency: Supporting Strong Web Cache Consistency", Global Internet 2002 (GI 2002), Taipei, Taiwan, 2002.

[20] J. Yin, L. Alvisi, M. Dahlin, C. Lin, "Hierarchical Cache Consistency in a WAN", Proceedings of the USENIX Symposium on Internet Technologies and Systems (USITS '99), Boulder, CO, October 1999.

[21] Michal Feldman, John Chuang, "Service Differentiation in Web Caching and Content Distribution", Proceedings of the IASTED International Conference on Communications and Computer Networks 2002, Cambridge, MA, November 2002.

[22] Haifeng Yu, Amin Vahdat, "Design and Evaluation of a Continuous Consistency Model for Replicated Services", Proceedings of the Fourth Symposium on Operating Systems Design and Implementation (OSDI 2000), San Diego, California, October 2000.

 

 

 

 
